All-atom simulations predict how single point mutations promote serpin misfolding

Authors

  • Fang Wang
  • Simone Orioli
  • Alan Ianeselli
  • Giovanni Spagnolli
  • Silvio a Beccara
  • Anne Gershenson
  • Pietro Faccioli
  • Patrick L. Wintrode
Abstract

Protein misfolding is implicated in many diseases, including the serpinopathies. For the canonical inhibitory serpin α1-antitrypsin (A1AT), mutations can result in protein deficiencies leading to lung disease, and polymerization-prone mutants can accumulate in hepatocytes, leading to liver disease. Using all-atom simulations based on the recently developed Bias Functional algorithm, we elucidate how wild-type A1AT folds and how the disease-associated S (Glu264Val) and Z (Glu342Lys) mutations lead to misfolding. The deleterious Z mutation disrupts folding at an early stage, while the relatively benign S mutant shows minor, late-stage misfolding. A number of suppressor mutations ameliorate the effects of the Z mutation, and simulations of these mutants help to elucidate the relative roles of steric clashes and electrostatic interactions in Z misfolding. These results demonstrate a striking correlation between atomistic events and disease severity and shed light on the mechanisms driving chains away from their correct folding routes.

Understanding how mutations alter protein misfolding propensities and the physico-chemical mechanisms underlying this shift is key to clarifying the molecular basis of many diseases. One set of relatively common protein misfolding diseases, the serpinopathies, arises when mutations in inhibitory serpins lead to misfolding, thus reducing the secreted levels of these important protease inhibitors [1]. Mutations in the canonical secretory serpin α1-antitrypsin (A1AT) result in the most common serpinopathies, the A1AT deficiencies. In A1AT deficiencies, low circulating A1AT levels dysregulate leukocyte serine proteases, resulting in lung disease, which can be slowed but not halted by A1AT augmentation therapy [2].
Extremely pathogenic A1AT mutations, such as Z (Glu342Lys), can lead to both lung disease, due to loss of function, and liver disease, due to A1AT accumulation in the endoplasmic reticulum (ER) of hepatocytes, which generate most of the circulating A1AT. With the exception of liver transplants, there are no effective treatments for A1AT-associated liver disease [3]. In vitro, the pathogenic A1AT Z mutant folds very slowly, spending hours in at least one partially folded intermediate state [4]. Similarly, Z secretion from cells is slow and, while some Z species are targeted for degradation [5-7], misfolded Z accumulates in the ER, where it can polymerize [8]. Despite numerous experimental studies [9-13], little is known about the structure of misfolded species for any A1AT disease-associated mutant, hindering efforts either to rescue the folding of these species or to target them for degradation.

Molecular dynamics (MD) simulations offer an attractive approach to study protein folding and misfolding, as they can in principle reveal folding pathways and intermediates in atomistic detail. To date, the application of all-atom MD simulations to protein folding and misfolding has been limited to small single-domain proteins with relatively short folding times. In particular, recent developments such as the Anton special-purpose supercomputer [14] and the massively distributed Folding@home project [15] have made it possible to generate in silico several reversible folding/unfolding events for a number of small globular proteins (< 100 amino acids) with folding times up to the ms range. These studies have demonstrated that current all-atom force fields in explicit solvent can lead to the correct native states of proteins and predict their folding kinetics with good accuracy. Unfortunately, most biologically relevant proteins are much larger than 100 amino acids and have folding times of seconds or longer.
In particular, A1AT and other serpins contain approximately 400 amino acids and fold over tens of minutes [9,12,13]. Due to their large size and slow folding kinetics, simulating serpin folding with conventional MD simulations is not feasible, even using the most powerful available supercomputers. In this work, we rely on a recently developed variational method called the Bias Functional (BF) approach (see Methods) to overcome this limitation. We use this scheme to characterize the folding and misfolding of wild-type (WT) A1AT, the pathological Z mutant and the relatively benign S (Glu264Val) mutant [1], starting from several fully denatured configurations. The results of these all-atom simulations provide testable, atomistic models for how WT A1AT folds, how disease-associated mutants misfold and how suppressor mutations can rescue misfolding. These results also shed light on the connections between single-point mutations and the pathogenicity of misfolding-prone proteins, and provide physical mechanisms responsible for misfolding phenotypes.

Results and Discussion

The BF approach. Using the BF algorithm [16], it is feasible to use standard computer clusters to generate many folding and misfolding events for proteins as large as serpins, using state-of-the-art all-atom force fields. This algorithm identifies the most realistic reaction pathways within an ensemble of uncorrelated trial trajectories generated by so-called ratchet-and-pawl MD (rMD) [17,18]. In rMD, a biasing force is introduced only when the chain tends to backtrack towards the unfolded state; it is not applied when the chain spontaneously progresses towards the native state. In the BF approach, a rigorously derived variational condition is applied to identify the trial rMD pathways that have the largest probability of occurring in the absence of any bias. In the following, we shall refer to these paths as Minimum Bias Trajectories (MBTs).
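The ratchet-and-pawl idea and the selection of a least-biased trajectory can be illustrated with a deliberately simplified sketch. Everything in it is an assumption made for illustration and is not taken from the Methods: a single collective coordinate `z` (smaller means closer to native) evolving by overdamped Langevin dynamics on a flat toy landscape, a harmonic form for the ratchet bias, unit mass and friction, and a bias functional approximated as the time integral of the squared bias force. The function names `ratchet_force`, `run_trial` and `least_biased` are hypothetical.

```python
import random


def ratchet_force(z, z_min, k):
    """Bias force on the collective coordinate z (smaller z = closer to native).

    Assumed harmonic ratchet: the force is zero while the chain progresses
    (z <= z_min) and restoring when it backtracks (z > z_min).
    """
    if z > z_min:
        return -k * (z - z_min)
    return 0.0


def run_trial(seed, n_steps=2000, dt=1e-3, k=50.0, temperature=1.0, z0=5.0):
    """Generate one toy trial trajectory; return (final z, accumulated bias)."""
    rng = random.Random(seed)
    z = z0
    z_min = z0                      # smallest z reached so far (the "pawl")
    bias_functional = 0.0           # assumed form: integral of |F_bias|^2 dt
    sigma = (2.0 * temperature * dt) ** 0.5
    for _ in range(n_steps):
        f_bias = ratchet_force(z, z_min, k)
        bias_functional += f_bias ** 2 * dt
        physical_force = 0.0        # flat toy landscape, no real force field
        z += (physical_force + f_bias) * dt + sigma * rng.gauss(0.0, 1.0)
        z_min = min(z_min, z)       # the ratchet only ever moves forward
    return z, bias_functional


def least_biased(seeds):
    """Pick the trial whose accumulated bias is smallest (toy stand-in for an MBT)."""
    results = {seed: run_trial(seed) for seed in seeds}
    best = min(results, key=lambda seed: results[seed][1])
    return best, results
```

Calling `least_biased(range(10))` returns the seed of the trial that needed the least bias together with all trial results. In an actual application the coordinate would be built from the native contact map and the physical force would come from an all-atom force field, neither of which this toy attempts to reproduce.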
The accuracy of all variational approaches depends critically on the quality of the model subspace, i.e. the set of possible solutions within which the optimal approximate solution is identified through the variational condition. In the BF approach, the model subspace is defined by the trial folding trajectories generated using the rMD protocol. Since a suboptimal choice of the model subspace may introduce uncontrolled systematic errors, variational approaches must be carefully benchmarked against exact methods and extensively validated against experimental data. The BF method was benchmarked against the results of MD folding simulations performed on the Anton supercomputer [16,19], with both methods using the same all-atom force field. In particular, the folding mechanism and the precise order of native contact formation during the folding of an all-beta protein (WW domain Fip35) and of an all-alpha protein (villin headpiece subdomain) predicted by the BF method were found to be statistically indistinguishable from those obtained using conventional MD methods.

The predictions of the BF approach have also been validated against experimental data. For example, the BF method was used to study the folding of two alpha proteins consisting of nearly 100 amino acids (IM7 and IM9), with highly homologous native structures but very different folding kinetics [20]. The BF method correctly reproduced all of the observed kinetic features. In a very recent investigation, the BF approach has been interfaced with quantum electronic structure calculations to yield a direct prediction of the expected time-resolved near and far UV circular dichroism (CD) spectra in the folding of canine lysozyme.
The BF results agree with the experimental data, correctly predicting the existence of a folding intermediate and reproducing the difference between the intermediate and native state CD spectra, both in the far UV, which reports on protein secondary structure, and in the near UV, which reports on the local environment of aromatic residues, particularly tryptophans (PF, unpublished data). An early version of the BF algorithm (called dominant reaction pathway or DRP) was also successfully applied to sample the conformational transition leading to serpin latency [21]. These BF calculations correctly reproduce the effects of point mutations on the reaction kinetics and provide an atomistically detailed picture that explains why binding a specific small molecule accelerates the latency transition of the serpin plasminogen activator inhibitor 1 (PAI-1). This study also delineates differences between the PAI-1 latency transition and that of A1AT, helping to explain why PAI-1 easily accesses the latent state while A1AT does not.

In order to further explore serpin conformational changes and to validate the BF simulation results, we begin the present study by simulating and analyzing the folding of WT A1AT. We compare the results of the simulations to available experimental results and provide new, testable, atomistically resolved data on how WT A1AT folds. These WT simulations also provide the reference for comparing folding and misfolding pathways. We therefore proceed to simulate and analyze the folding of disease-associated A1AT variants and suppressor mutants.

WT A1AT folding pathways. For all of the A1AT variants, folding starts with the independent formation of local structures (Fig. 1A, stage 1) that we refer to as foldons, following the usage of Wolynes, Englander and coworkers [22,23].
In the majority of successful WT trajectories, foldons dock in a well-defined order, as determined from visual inspection, from the plot of the radius of gyration versus fraction of native contacts and from an automated statistical analysis that identifies the most relevant change-points [24] (Fig. 1A). In WT MBTs, early native interactions are formed between residues at the top of strands 5/6A (s5/6A) in the nascent sheet A and the nascent B-C barrel formed by parts of strands 3C, 4C and 1 to 3B (Fig. 1A, stage 2). In particular, a hydrogen bond is formed between Glu342 and Thr203, and van der Waals interactions are established between Pro289 and Met residues 220 and 221. The structures corresponding to stages 1 (foldon formation) and 2 (early inter-foldon interactions) in Fig. 1A were subjected to 200 ns of conventional MD simulations in explicit solvent. In these MD simulations, the C-terminal β-hairpin formed by strands 4/5B is not stable, consistent with the experimental finding that the isolated peptide containing the 36 C-terminal A1AT residues lacks stable structure in the absence of the rest of the protein [25]. In contrast, as shown in Fig. 2A, the three N-terminal foldons and the network of interactions between the nascent B-C barrel and s5/6A are stable on the time scale of these conventional MD simulations.

In most WT trajectories (see Movie S1), non-native interactions between the loop at the top of strand 3A and β-strands 2 and 3B prevent correct positioning of the B-C barrel, leading to the first barrier (Fig. 1A, stage 3). Disruption of this steric hindrance allows for the correct positioning of the B-C barrel and docking of s5/6A to s1-3A, thus completing β-sheet A formation (Fig. 1A, stage 4). At this stage, the C-terminal β-hairpin (s4/5B) and the N-terminal helices are solvent exposed and free to move on flexible linkers. Folding completes when the C-terminal hairpin docks to strands 1 to 3B (Fig. 1A, stage 5), followed by packing of the N-terminal helices and s6B on the back of the β-sheets.

Our finding that completion of sheet A precedes C-terminal hairpin incorporation into sheet B agrees with fragment complementation studies, where docking of a fragment containing the C-terminal hairpin with a larger N-terminal A1AT fragment requires the presence of s5A [11] and thus, presumably, completion of sheet A. Similarly, in kinetic refolding experiments monitored by oxidative labeling of sidechains or hydrogen/deuterium exchange of backbone amides detected by mass spectrometry (MS), s5A is one of the last regions to acquire native-like protection [12,13]. These same MS-based studies found that s4B remains solvent exposed until late in the folding process, consistent with the late packing of the C-terminal hairpin (s4/5B) seen in our simulations.

WT A1AT contains two Trp residues: Trp238 in strand 2B and Trp194, C-terminal to strand 3A in the "breach" region at the top of sheet A. Studies of A1AT single-Trp mutants in the WT background show that during equilibrium unfolding, Trp194 is sensitive to the native-to-intermediate transition and that both Trp194 and Trp238 are sensitive to the intermediate-to-unfolded transition [26]. In addition, stopped-flow studies of WT A1AT refolding monitored by Trp fluorescence emission reported at least three kinetic phases: a very fast, ~50 ms, phase; a slower, ~500 ms, phase; and a very slow phase lasting hundreds of seconds [9]. Consistent with these data, in the WT A1AT BF folding simulations, Trp238 is buried early, in stages 1 and 2, as the foldons and nascent B-C barrel form (Fig. 1B), while consolidation of the breach and Trp194 burial and quenching, presumably by Tyr244, occur later, when sheet A fully forms (Fig. 1A, stage 4 and Fig. 1B). Trp194 is fully buried in the second-to-last folding step, when the C-terminal hairpin inserts into sheet B, correctly positioning the reactive center loop (RCL).
Thus the BF folding simulation results are in qualitative agreement with results from A1AT folding experiments monitored by Trp fluorescence emission.

While A1AT has only a single Cys residue and no disulfide bonds, folding studies of two serpins containing disulfide bonds, ovalbumin and antithrombin III, show that N-terminal disulfide bond formation and rearrangements occur after the C-terminus packs [27,28]. These experimental results imply that the N-terminus is involved in the final stage of serpin folding. The large conformational transition performed by the N-terminal foldon in the last stage of WT A1AT folding suggests that the last steps in A1AT, ovalbumin and antithrombin III folding are similar and provides a testable, atomistically detailed explanation for the experimental results. The BF WT A1AT folding simulations are in good agreement with existing serpin folding data and set the stage for generating detailed models of mutation-induced serpin misfolding.

Misfolding of A1AT disease-associated variants. BF folding simulations for the S (Glu264Val) and Z (Glu342Lys) mutants suggest that these two A1AT variants misfold differently. To analyze the differences between WT, S and Z folding and misfolding in these nonequilibrium BF folding simulations, we focused on the observation that mutants with a higher misfolding propensity are less likely to reach configurations in which almost all of the native contacts have been formed. To quantify this measure, we used all of the frames from all of the MBTs generated for each A1AT variant to construct a histogram describing how often that variant visits configurations with a given fraction of native contacts, Q (Fig. 3A). Comparisons between such histograms for WT and Z simulations correctly predict that the pathological Z mutant is significantly less able to populate structures close to the native state.
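The histogram construction described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the analysis code used for Fig. 3A: each frame is assumed to be already reduced to a set of residue-pair contacts, the contact definition itself (distance cutoff, atom selection) is left out, and the names `fraction_native_contacts` and `q_histogram` are hypothetical.

```python
def fraction_native_contacts(frame_contacts, native_contacts):
    """Q for one frame: share of native contacts present in that frame.

    Contacts are (residue_i, residue_j) tuples; non-native contacts in the
    frame are ignored.
    """
    return len(frame_contacts & native_contacts) / len(native_contacts)


def q_histogram(trajectories, native_contacts, n_bins=10):
    """Normalized histogram of Q pooled over all frames of all trajectories."""
    counts = [0] * n_bins
    total = 0
    for trajectory in trajectories:
        for frame_contacts in trajectory:
            q = fraction_native_contacts(frame_contacts, native_contacts)
            # Q = 1.0 is placed in the last bin rather than a bin of its own.
            index = min(int(q * n_bins), n_bins - 1)
            counts[index] += 1
            total += 1
    return [count / total for count in counts]
```

Under these assumptions, a variant that rarely approaches the native state would show little weight in the highest-Q bins, which is the qualitative signature discussed in the text for the Z mutant.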
In fact, none of the Z mutant MBTs successfully reached the fully folded state. In contrast, the Q histograms for S and WT trajectories are similar, consistent with the observation that the S mutation has only mild effects on A1AT folding [1] (Fig. 3A).

The Z mutation disrupts a network of electrostatic and hydrogen-bonding interactions at the top of β-sheet A (Fig. 3B). Experimentally, Z misfolding can be rescued by mutating Lys290 to Glu [29], leading to a reversed version of the original salt bridge (K290E/E342K or Glu/Lys) and reducing the probability of steric clashes. Consistent with experiment, our Glu/Lys simulations show a WT-like distribution of native contacts (Fig. 3A). One way to test the relative roles of electrostatics and sterics in Z misfolding is to place Glu at both 342 and 290 (K290E/E342 or Glu/Glu), thus preserving the electrostatic repulsion present in Z but reducing the side chain length. Consistent with cellular studies, where Z and Glu/Glu show approximately 20% and 75% secretion efficiency, respectively, relative to WT A1AT [30], the distribution of native contacts in the Glu/Glu simulations resembles that of WT (Fig. 3A, Movie S2). The fact that both experiments and computations indicate that Glu/Glu can fold with reasonable efficiency suggests that the 290/342 salt bridge may not be essential for A1AT folding. To further test this hypothesis, we determined whether replacing Lys290 with Ser could rescue Z (K290S/E342K or Ser/Lys) simply by alleviating steric clashes. The resulting MBTs are more WT-like than Z-like (Fig. 3A).

Investigating the order of contact formation. Does misfolding of A1AT mutants occur because some key native contacts are formed in the wrong order? To answer this question, we performed a statistical analysis based on the distribution of the path similarity s introduced in ref. [18] and defined in the Methods.
This parameter is equal to 1 when native contacts in two reactive trajectories are formed in exactly the same order, whereas s = 0 when the order is entirely different. For completely random sequences of native contact formation, the similarity distribution is sharply peaked around 0.3. Path similarities calculated between MBTs for the same variant (self-similarity) or between trajectories of different variants (cross-similarity) are shown in Fig. 4. For any given A1AT variant, the self-similarity distribution obtained from all of the MBTs peaks around 0.6 (Fig. 4A). In contrast, comparisons between different variants show that Z MBTs differ significantly from WT (peak s ~ 0.4), while S and WT pathways are similar (peak s ~ 0.6) (Fig. 4B).

Path similarity analyses also indicate that while mutations at Lys290 may suppress Z misfolding, the paths are very different for different combinations of residues at positions 290 and 342. Folding of the Glu/Glu variant, where Lys290 is mutated to Glu in the WT background, producing an anionic charge repulsion between residues 290 and 342 in place of the cationic repulsion in the Z mutant, is most WT-like. The folding of the Ser/Lys variant diverges the most from WT, while the order in which contacts are formed in Glu/Lys, the charge-reversed salt bridge in the Z background, spans a large range from WT-like to quite different (Fig. 4B). These results, and the finding that in the BF folding simulations all of these salt-bridge variants are more likely to fold than Z, suggest that A1AT can fold using a number of alternative folding pathways. Moreover, they show that the Lys290-Glu342 salt bridge, while not required for folding, is important for increasing the probability of the major folding mechanism observed in the WT A1AT BF folding simulations.

In S and WT trajectories, misfolding occurs late, resulting from premature docking of the N-terminal helices, which leaves the C-terminus solvent exposed (Fig. 5, Movie S3).
Premature docking of the N-terminal helices to the rest of A1AT is essentially irreversible in the BF simulations due to the number of native contacts gained in these interactions. However, in experiments this misfolding may be reversible, and in cells interactions between the lectin chaperones and the N-terminal glycans at Asn residues 46 and 83 could help protect against this late misfolding.

In striking contrast, Z and WT folding diverge much earlier (Movie S4). As discussed above, the Z mutation replaces a conserved salt bridge between Lys290 and Glu342 (Fig. 5) with a charge-charge repulsion. This mutation appears to disrupt early native and non-native interactions between β-strands 5/6A and the B-C barrel. Electrostatic repulsion and steric hindrance, due to the length of the two Lys sidechains (K290/E342K), increase the probability that these residues assume a non-native spatial orientation. In contrast to Z, Glu/Glu achieves WT-like docking between s5/6A and the B-C barrel despite the electrostatic repulsion between the two Glu residues. This supports the hypothesis that sterics, in addition to electrostatics, play a key role in Z misfolding.

In conclusion, our calculations provide a coherent atomistic and physics-based picture of serpin folding and misfolding. For WT A1AT, we find that there is a major folding pathway that begins with the initial assembly of local structural units, followed by higher-order associations, some of which involve non-native contacts. The pathway ends with the incorporation of the C-terminal β-hairpin followed by docking of the N-terminal helices. These findings are supported by existing experimental data [9-13,26-30], and the detailed molecular mechanisms provided here are experimentally testable. The multi-dimensional free energy landscape for folding is complex and, as exemplified by the Z suppressor mutants, there are alternative ways to successfully fold to the native state.
Our simulations of pathological and suppressor mutants also elucidate the mechanism of Z misfolding and predict that misfolding occurs early in the folding process, a prediction that should be amenable to experimental testing. The scheme presented in this work opens a way to investigate many important disease-associated processes that occur over minutes or hours, using only ordinary medium-sized computer clusters of the type available to most computational laboratories.
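The comparison of contact-formation order used in the path similarity analysis can be illustrated with a toy measure. The sketch below is not the definition of s given in the Methods or in ref. [18], which is not reproduced in this text. It simply assumes that each trajectory has been reduced to the time at which every native contact first forms and scores the fraction of contact pairs whose relative order agrees. It shares the limits stated in the text (1 for identical order, 0 for fully reversed order), but its random-order baseline need not match the value of about 0.3 quoted there. The names `relative_order` and `path_similarity` are hypothetical.

```python
from itertools import combinations


def relative_order(t_i, t_j):
    """Encode whether contact i forms before (1.0), after (0.0) or with (0.5) contact j."""
    if t_i < t_j:
        return 1.0
    if t_i > t_j:
        return 0.0
    return 0.5


def path_similarity(times_a, times_b):
    """Toy order similarity between two trajectories.

    times_a and times_b map each native contact to its first formation time.
    Only contacts present in both trajectories are compared.
    """
    shared = sorted(set(times_a) & set(times_b))
    pairs = list(combinations(shared, 2))
    agreeing = sum(
        1
        for i, j in pairs
        if relative_order(times_a[i], times_a[j])
        == relative_order(times_b[i], times_b[j])
    )
    return agreeing / len(pairs)
```

Self-similarity and cross-similarity distributions of the kind discussed for Fig. 4 would then follow from evaluating such a measure over all pairs of trajectories within one variant or across two variants.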

Publication date: 2017